Learning part of speech disambiguation rules using Inductive Logic Programming

Authors

  • Nikolaj Lindberg
  • Martin Eineborg
Abstract

A pilot study on inducing rules for part-of-speech tagging of unrestricted Swedish text is reported. Using the Progol machine-learning system, Constraint Grammar-inspired rules were learnt from the part-of-speech-tagged Stockholm-Umeå Corpus. Several thousand disambiguation rules discarding faulty readings of ambiguously tagged words were induced. When tested on unseen data, 97% of the words retained the correct reading after tagging. However, some ambiguity remained in the output after applying the tagging rules: on average, 1.15 tags per word.
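
The two figures reported in the abstract (the share of words that retain the correct reading, and the average number of tags left per word) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Token` class, the single hand-written rule, the tag names, and the English toy sentence are hypothetical and are not taken from the paper, the induced rules, or the Swedish corpus.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Token:
    word: str
    tags: Set[str]  # candidate readings still considered possible
    gold: str       # the correct tag, used only for evaluation


# A rule looks at position i in the sentence and returns the tags to discard there.
Rule = Callable[[List[Token], int], Set[str]]


def remove_verb_after_determiner(sentence: List[Token], i: int) -> Set[str]:
    """Hypothetical rule: discard the verb reading right after an unambiguous determiner."""
    if i > 0 and sentence[i - 1].tags == {"DT"}:
        return {"VB"}
    return set()


def apply_rules(sentence: List[Token], rules: List[Rule]) -> None:
    """Discard readings in place, never removing the last remaining reading of a word."""
    for i, token in enumerate(sentence):
        for rule in rules:
            remaining = token.tags - rule(sentence, i)
            if remaining:
                token.tags = remaining


def evaluate(sentence: List[Token]) -> tuple:
    """Return (share of words whose correct tag survived, average tags per word)."""
    n = len(sentence)
    retained = sum(1 for t in sentence if t.gold in t.tags)
    total_tags = sum(len(t.tags) for t in sentence)
    return retained / n, total_tags / n


# Toy input (assumed): two ambiguous words, one of which the rule can resolve.
sentence = [
    Token("the", {"DT"}, "DT"),
    Token("run", {"NN", "VB"}, "NN"),
    Token("ends", {"NN", "VB"}, "VB"),
]
apply_rules(sentence, [remove_verb_after_determiner])
recall, ambiguity = evaluate(sentence)
print(recall, ambiguity)
```

In this toy case the rule resolves "run" but leaves "ends" ambiguous, so every word keeps its correct reading while the output still carries more than one tag per word on average. This mirrors the trade-off described in the abstract: rules that only discard readings can keep the correct one at a high rate without fully disambiguating the text.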

Similar resources

Learning Constraint Grammar-style Disambiguation Rules using Inductive Logic Programming

This paper reports a pilot study, in which Constraint Grammar inspired rules were learnt using the Progol machine-learning system. Rules discarding faulty readings of ambiguously tagged words were learnt for the part of speech tags of the Stockholm-Umeå Corpus. Several thousand disambiguation rules were induced. When tested on unseen data, 98% of the words retained the correct reading after tag...

NP chunking using ILP

This is to report the results of approaching the problem of NP chunking using Inductive Logic Programming techniques. The problem, as defined in (Ramshaw and Marcus, 1995), is the machine learning of rules that recognise non-recursive, base NPs in text annotated with part-of-speech tags, by tagging each word as being 'inside' or 'outside' an NP. (Consecutive NPs are appropriately treated.) The same ...

Unsupervised Learning of Disambiguation Rules for Part of Speech Tagging

In this paper we describe an unsupervised learning algorithm for automatically training a rule-based part of speech tagger without using a manually tagged corpus. We compare this algorithm to the Baum-Welch algorithm, used for unsupervised training of stochastic taggers. Next, we show a method for combining unsupervised and supervised rule-based training algorithms to create a highly accurate t...

Learning Expressive Models for Word Sense Disambiguation

We present a novel approach to the word sense disambiguation problem which makes use of corpus-based evidence combined with background knowledge. Employing an inductive logic programming algorithm, the approach generates expressive disambiguation rules which exploit several knowledge sources and can also model relations between them. The approach is evaluated in two tasks: identification of the...

Mining Association Rules in Multiple Relations

The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the system Warmr, which extends Apriori [2] to mine assoc...

Publication date: 2007